Blame assignment for errors made by large vocabulary speech recognizers

Author

  • Lin Lawrence Chase
Abstract

This paper describes an approach to identifying why speech recognition errors occur. The algorithm presented requires an accurate word transcript of the utterances being analyzed. It assigns each error to one of seven categories: 1) an out-of-vocabulary (OOV) word was spoken, 2) search error, 3) homophone substitution, 4) language model overwhelming correct acoustics, 5) transcript/pronunciation problems, 6) confused acoustic models, or 7) miscellaneous/not possible to categorize. Errors in some categories can supply training data to automatic corrective training methods that refine acoustic models. Errors in other categories give language model and lexicon designers examples that point to potential improvements. The algorithm is described, and results are presented for the Sphinx-II recognizer [4] on the combined 1992-1995 evaluation test sets of the North American Business (NAB) corpus [1] [2] [3].
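The abstract names the seven categories but not the tests used to choose among them. As a rough illustration only, the sketch below shows one way such a decision cascade could look. The ordering of the checks, the score comparisons, the `ErrorRegion` fields, and the `assign_blame` function are all assumptions made for this example and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Blame(Enum):
    OOV = "out-of-vocabulary word spoken"
    SEARCH = "search error"
    HOMOPHONE = "homophone substitution"
    LANGUAGE_MODEL = "language model overwhelming correct acoustics"
    TRANSCRIPT_OR_PRONUNCIATION = "transcript/pronunciation problems"
    ACOUSTIC = "confused acoustic models"
    MISC = "miscellaneous/not possible to categorize"


@dataclass
class ErrorRegion:
    """Hypothetical record for one misrecognized span (log-domain scores)."""
    ref_words: list[str]          # words from the accurate transcript
    hyp_words: list[str]          # words the recognizer output
    ref_acoustic: float           # acoustic score of the reference words
    hyp_acoustic: float           # acoustic score of the hypothesis words
    ref_lm: float                 # language model score of the reference
    hyp_lm: float                 # language model score of the hypothesis
    ref_alignment_ok: bool = True # assumed flag: reference aligned plausibly


def phones(words, pronunciations):
    """Concatenate the phone sequences of the words, or None if one is missing."""
    out = []
    for w in words:
        if w not in pronunciations:
            return None
        out.extend(pronunciations[w])
    return out


def assign_blame(region, vocabulary, pronunciations):
    """Assumed decision cascade; the paper's actual tests may differ."""
    if any(w not in vocabulary for w in region.ref_words):
        return Blame.OOV
    if not region.ref_alignment_ok:
        return Blame.TRANSCRIPT_OR_PRONUNCIATION
    ref_ph = phones(region.ref_words, pronunciations)
    hyp_ph = phones(region.hyp_words, pronunciations)
    if ref_ph is None or hyp_ph is None:
        return Blame.MISC
    if ref_ph == hyp_ph:
        return Blame.HOMOPHONE
    ref_total = region.ref_acoustic + region.ref_lm
    hyp_total = region.hyp_acoustic + region.hyp_lm
    if ref_total > hyp_total:
        # The reference scores better overall, so the decoder failed to find it.
        return Blame.SEARCH
    if region.ref_acoustic > region.hyp_acoustic:
        # Acoustics favor the reference; the language model tipped the balance.
        return Blame.LANGUAGE_MODEL
    if region.hyp_acoustic > region.ref_acoustic:
        return Blame.ACOUSTIC
    return Blame.MISC
```

In this sketch, a region tagged `Blame.ACOUSTIC` would be a candidate for the corrective acoustic training mentioned in the abstract, while `Blame.OOV`, `Blame.LANGUAGE_MODEL`, and `Blame.TRANSCRIPT_OR_PRONUNCIATION` regions would be passed to lexicon and language model designers.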

Similar resources

Why speech recognizers make errors? A robustness view

The performance of large vocabulary speech recognizers often varies depending on the input speech and the quality of the trained models. The particular attributes that cause recognition errors are a research area that has not been well studied. This paper addresses this issue from a robustness perspective using a large amount of field data collected from natural language dialog services. In par...

Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs

This paper proposes a spoken term detection using syllable transition network (STN) derived from multiple speech recognizers. An STN is similar to a sub-word based confusion network, which is derived from the output of a speech recognizer. The one we proposed is derived from the outputs of multiple speech recognition systems, which is well known to be robust to certain recognition errors and th...

Factorization of Language Constraints in Speech Recognition

Integration of language constraints into a large vocabulary speech recognition system often leads to prohibitive complexity. We propose to factor the constraints into two components. The first is characterized by a covering grammar which is small and easily integrated into existing speech recognizers. The recognized string is then decoded by means of an efficient language post-processor in whic...

Automatic Generation of Pronunciation Dictionaries

In this report we will describe a data driven approach for creating pronunciation dictionaries for a new unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process recordings of the new language that are transcribed on word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per t...

Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs

Spoken Term Detection (STD) that considers the out-of-vocabulary (OOV) problem has generated significant interest in the field of spoken document processing. This study describes STD with false detection control using phoneme transition networks (PTNs) derived from the outputs of multiple speech recognizers. PTNs are similar to subword-based confusion networks (CNs), which are originally derive...

Publication date: 1997